A Model for High-coverage Lexical Semantic Annotation Generation
Abstract
AI applications often receive their input as natural language text or as a transcription of speech. To process such input, a commonsense inference system must first transform it into a formal representation with a limited vocabulary. In this paper, we present a method based on neural word embeddings that automatically assigns semic features to natural language words. These features either describe the ontological category of a given word or provide some characterization or additional information. We show that our method has high coverage, performs well for English and Hungarian, and can easily be extended to other languages.
Similar resources
Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages
The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resource...
Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
“Lightweight” semantic annotation of text calls for a simple representation, ideally without requiring a semantic lexicon to achieve good coverage in the language and domain. In this paper, we repurpose WordNet’s supersense tags for annotation, developing specific guidelines for nominal expressions and applying them to Arabic Wikipedia articles in four topical domains. The resulting corpus has ...
Evaluating Lexical Resources for a Semantic Tagger
Semantic lexical resources play an important part in both linguistic study and natural language engineering. In Lancaster, a large semantic lexical resource has been built over the past 14 years, which provides a knowledge base for the USAS semantic tagger. Capturing semantic lexicological theory and empirical lexical usage information extracted from corpora, the Lancaster semantic lexicon prov...
Development of the Multilingual Semantic Annotation System
This paper reports on our research to generate multilingual semantic lexical resources and develop multilingual semantic annotation software, which assigns each word in running text to a semantic category based on a lexical semantic classification scheme. Such tools have an important role in developing intelligent multilingual NLP, text mining and ICT systems. In this work, we aim to extend an ...
Data-Driven Learning in an Incremental Grammar Framework
Overview: Incremental processing of both syntax and semantics, both in parsing and generation, is of significant interest for modelling the human language capability, and for building systems which interact with it. Formal linguistics has made significant contributions to this; one example is the framework Dynamic Syntax, which provides an inherently word-by-word incremental grammatical framewor...
Publication date: 2017